Learning a model of speaker head nods using gesture corpora
Abstract
During face-to-face conversation, the head is constantly in motion, especially during speaking turns [2]. These movements are not random; research has identified a number of important functions that they serve [7] [5] [3] [4]. Head movements convey information beyond the verbal channel, such as nods that show agreement or shakes that express disbelief. The goal of our work is to build a domain-independent model of speakers' head movements and to use it to generate head movements for virtual agents. To be usable in interactive virtual agents, the model must operate in real time. For this reason, we focus on features that are readily available at the time head movements are generated. We also aim to make the model portable to other systems by using features, such as part-of-speech tags, that are easy to obtain even with different language tools. In this paper, we present a data-driven, automated approach to generating speaker nonverbal behavior, which we demonstrate and evaluate by learning when head nods should occur. Specifically, the approach uses a machine-learning technique (learning a hidden Markov model [8]) to create a head nod model from annotated corpora of face-to-face human interaction, relying on the linguistic features of the surface text. Figure 1 gives an overview of the procedure for learning the model. Once the patterns of when people nod are learned, the model can be used to generate head nods for virtual agents: a new sample is encoded with the same factors used for learning and fed to the model to obtain the most likely head movement.
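The final step described in the abstract, encoding a new sample and asking a hidden Markov model for the most likely head movement, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two hidden states (`NOD`, `NO_NOD`), the small part-of-speech tag set, and all probabilities are invented rather than learned from a corpus, and Viterbi decoding is used as one standard way to obtain the most likely state sequence. The paper's actual model structure and features may differ.

```python
import math

# Hypothetical two-state HMM. All probabilities below are invented for
# illustration; in practice they would be estimated from annotated corpora.
STATES = ("NO_NOD", "NOD")
START = {"NO_NOD": 0.7, "NOD": 0.3}
TRANS = {
    "NO_NOD": {"NO_NOD": 0.8, "NOD": 0.2},
    "NOD": {"NO_NOD": 0.4, "NOD": 0.6},
}
# Emission probabilities over an assumed, tiny part-of-speech tag set.
EMIT = {
    "NO_NOD": {"DT": 0.3, "NN": 0.3, "VB": 0.3, "UH": 0.05, "RB": 0.05},
    "NOD": {"DT": 0.05, "NN": 0.1, "VB": 0.15, "UH": 0.4, "RB": 0.3},
}


def viterbi(tags):
    """Return the most likely hidden-state sequence for a list of POS tags."""
    # Log probabilities avoid numerical underflow on long utterances.
    scores = [{s: math.log(START[s]) + math.log(EMIT[s][tags[0]]) for s in STATES}]
    backpointers = []
    for tag in tags[1:]:
        prev = scores[-1]
        current, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for landing in state s at this step.
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            current[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][tag])
            pointers[s] = best
        scores.append(current)
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # An utterance already encoded as POS tags, e.g. "Yeah, really, the idea".
    print(viterbi(["UH", "RB", "DT", "NN"]))
```

Because the sketch consumes only part-of-speech tags, it reflects the portability argument above: any tagger that emits a compatible tag set could supply the input.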
Similar resources
On the temporal domain of co-speech gestures: syllable, phrase or talk spurt?
This study explores the use of automatic methods to detect and extract hand gesture movement co-occurring with speech. Two spontaneous dyadic dialogues were analyzed using 3D motion-capture techniques to track hand movement. Automatic speech/non-speech detection was performed on the dialogues, resulting in a series of connected talk spurts for each speaker. Temporal synchrony of onset and offset ...
Evaluating models of speaker head nods for virtual agents
Virtual human research has often modeled nonverbal behaviors based on the findings of psychological research. In recent years, however, there have been growing efforts to use automated, data-driven approaches to find patterns of nonverbal behaviors in video corpora and even thereby discover new factors that have not been previously documented. However, there have been few studies that compare h...
Real Time Head Gesture Recognition in Affective Interfaces
In this paper we present the affective message box, a dialog box that employs a real-time head gesture recognition system as its input modality. Head nods and shakes correspond to “Yes/No” options on the dialog box. In addition, a confidence measure is inferred from a number of parameters extracted from the gesture’s temporal patterns. While in current dialog boxes the input is either a definite ye...
Feedback in Nordic First-Encounters: a Comparative Study
The paper compares how feedback is expressed via speech and head movements in comparable corpora of first encounters in three Nordic languages: Danish, Finnish and Swedish. The three corpora have been collected following common guidelines, and they have been annotated according to the same scheme in the NOMCO project. The results of the comparison show that in this data the most frequent feedba...
Predicting Listener Backchannels: A Probabilistic Multimodal Approach
During face-to-face interactions, listeners use backchannel feedback such as head nods as a signal to the speaker that the communication is working and that they should continue speaking. Predicting these backchannel opportunities is an important milestone for building engaging and natural virtual humans. In this paper we show how sequential probabilistic models (e.g., Hidden Markov Model (HMM)...